Community structure analysis is a powerful tool for complex networks and
can simplify their functional analysis considerably. Many approaches to
community structure detection have been proposed recently, but few works
have focused on the significance of community structure. Since real
networks obtained from complex systems always contain error links, and
most community detection algorithms have random factors, evaluating the
significance of community structure is important and urgent. In this
paper, we use the stability of eigenvectors to characterize the
significance of community structure. By employing the eigenvalues of the
Laplacian matrix of a given network, we can evaluate the significance of
its community structure and obtain the optimal number of communities,
both of which are hard tasks for community detection algorithms. We apply
our method to many real networks. We find that significant community
structures exist in many social networks and in the C. elegans neural
network, and that less significant community structures appear in
protein-interaction networks and metabolic networks. Our method can be
applied to broad clustering problems in data mining owing to its solid
mathematical basis and efficiency.

Complex networks have become a general tool for the analysis of complex
systems with many interacting elements. The study of community structure
is of great importance for complex networks (see [1] for a review). In
many real-world networks, some small subnetworks (communities) have more
connections within themselves and comparatively fewer connections to the
rest of the network. Since nodes in a tight-knit subnetwork have more
properties in common, dividing the network into such communities can
simplify the functional analysis considerably. As a result, the
identification of community structure has been the focus of many recent
efforts. Generally speaking, such an identification contains two
problems. The first is to detect the community structure, which has been
studied extensively during the past five years [?]. The second is to
evaluate the significance of the community structure, which has hardly
been settled so far. We believe that some networks have clear communities
while others do not. But whether or not community structure exists in a
network, almost all algorithms will return a "community structure"; many
algorithms even find community structures in random networks, where
essentially none exists. Besides, many real-world networks contain some
error links, and algorithms for detecting community structure have some
random factors [?]. How can we evaluate the effects of error links and
random factors on the community structure? The evaluation of the
significance of community structure is therefore imperative. Given a
network, it is meaningless to detect communities when the community
structure is not significant or when just a few error links can
considerably change the detected community structure.

In previous works, only a few methods [?, ?, ?] can evaluate the
significance of community structure, and all of them require knowing the
community structure before the evaluation. However, the significance of
community structure should be a property of the network itself,
independent of the partition algorithm, and it should be possible to
evaluate it without knowing the exact communities. Following the
well-studied bi-community problem [7], calculating the significance of
community structure can be transformed into measuring the stability of
eigenvectors. In the following sections, we extend the bi-community
problem to the multi-community problem and design an index to evaluate
the significance of community structure. Furthermore, we apply the method
to many types of networks. We find that the C. elegans neural network and
social networks usually have distinct community structure, while
metabolic networks and protein-interaction networks do not. The results
are consistent with our previous research [11].



I. METHOD 

How can we evaluate the impact of error links and of the random factors
of an algorithm? The two aspects can be merged into one problem: we can
regard the random factors of algorithms as cases equivalent to error
links, that is, we can assume that all random factors are caused by error
links. If the community structure is very clear, a few error links will
not affect the structure greatly, and neither will the random factors of
the algorithm [?]. Otherwise, if the community structure is fuzzy, a few
error links will affect the structure greatly, and the random factors of
the algorithm will also induce a big change in the community structure.
So the only problem is how to evaluate the effect of error links on the
community structure. We propose a method to evaluate this significance.
The method has a solid mathematical basis, so the analysis of
significance is easy and reliable, and the significance of community
structure can be evaluated effectively.

A. Robustness of Community Structure 

We begin by defining the adjacency matrix $A$ of a network, with elements
$A_{ij} = 1$ when there is an edge joining vertices $i$ and $j$, and
$A_{ij} = 0$ otherwise. The corresponding Laplacian matrix $L$ is defined
as $L_{ij} = -A_{ij}$ if $i \neq j$, and $L_{ii} = k_i$, where $k_i$ is
the degree of node $i$. $\lambda_i$ is an eigenvalue and $v_i$ is the
corresponding eigenvector of $L$. Moreover, we let
$0 = \lambda_1 \le \lambda_2 \le \lambda_3 \le \cdots \le \lambda_n$,
$v_i^T v_j = 0$ if $i \neq j$, and $v_i^T v_i = 1$ for all $i$. In the
well-studied bi-community problem [?] (partitioning the network into two
communities when the size of each community is known in advance), the
community structure vector $s$ with elements $s_i$ is defined as:
$s_i = 1$ if node $i$ belongs to community 1 and $s_i = -1$ if node $i$
belongs to community 2. $s$ can be written as a linear combination of the
normalized eigenvectors $v_i$: $s = \sum_{i=1}^{n} a_i v_i$, where
$a_i = v_i^T s$. Since $s^T s = n$, we have $\sum_{i} a_i^2 = n$, and the
bi-community problem can be written as an optimization problem:

$$\min Z = s^T L s = \sum_{i=1}^{n} a_i^2 \lambda_i, \qquad (1)$$

where $\frac{1}{4} Z$ is the number of links between the two partitioned
communities.
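
The identities behind Eq. (1) can be checked numerically. The following is a minimal sketch, assuming a toy network of two triangles joined by one link; the helper name `laplacian` and the toy edge list are illustrative choices, not taken from the paper.

```python
import numpy as np

def laplacian(adj):
    """Return L = D - A for a symmetric 0/1 adjacency matrix."""
    adj = np.asarray(adj, dtype=float)
    return np.diag(adj.sum(axis=1)) - adj

# Toy network (an assumption for illustration): two triangles joined by one edge.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
n = 6
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

L = laplacian(A)
s = np.array([1, 1, 1, -1, -1, -1], dtype=float)  # community vector s_i = +1 / -1

lam, V = np.linalg.eigh(L)  # ascending eigenvalues, orthonormal eigenvectors as columns
a = V.T @ s                 # expansion coefficients a_i = v_i^T s

Z = s @ L @ s                                        # quadratic form s^T L s
cut = sum(1 for i, j in edges if s[i] != s[j])       # links between the two communities
spectral_Z = float(np.sum(a**2 * lam))               # sum_i a_i^2 lambda_i
```

Here `Z`, `4 * cut` and `spectral_Z` agree up to rounding, which is the content of Eq. (1) for this particular partition.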

Minimizing $Z$ is a hard problem. It can be equated with the task of
choosing the nonnegative quantities $a_i^2$ so as to place as much as
possible of the weight in the sum in the terms corresponding to the
lowest eigenvalues and as little as possible in the terms corresponding
to the highest eigenvalues [?]. So the above optimization problem can be
simplified as:

$$\min Z \approx \max \hat{Z} = a_2^2. \qquad (2)$$

Now we extend the above bi-community problem to the multi-community
problem. Suppose that a network has $n$ nodes and $c$ communities, and
that $0 = \lambda_1 < \lambda_2 \approx \lambda_3 \approx \cdots \approx
\lambda_c < \lambda_{c+1} \le \cdots \le \lambda_n$ [?]. Let $S_1$ denote
the community vector of community 1: if node $i$ belongs to community 1,
then $S_{1,i} = 1$, and $S_{1,i} = -1$ otherwise. Then
$\frac{1}{4} S_1^T L S_1$ is the number of edges between community 1 and
the rest of the network. Consequently, we can define the optimal
partition quantitatively as:

$$\min Z = \sum_{i=1}^{c} S_i^T L S_i. \qquad (3)$$

Let $S = (S_1^T, S_2^T, \cdots, S_c^T)^T$ and let
$\mathcal{L} = \mathrm{diag}(L, L, \cdots, L)$ be the block-diagonal
matrix with $c$ copies of $L$; thus, we have

$$\min Z = S^T \mathcal{L} S. \qquad (4)$$



We can obtain all orthogonal and normalized eigenvectors $u_q$ and the
corresponding eigenvalues $\tau_q$ of the block-diagonal matrix
$\mathcal{L} = \mathrm{diag}(L, L, \cdots, L)$, where
$q = 1, 2, \cdots, n \times c$. Obviously, each eigenvalue of $L$ is an
eigenvalue of $\mathcal{L}$ and is repeated $c$ times. Without loss of
generality, we let $\tau_{c \times i - c + j} = \lambda_i$,
$j = 1, 2, \cdots, c$. Let $SU$ be the set of eigenvectors of
$\mathcal{L}$ belonging to the eigenvalues
$\lambda_2, \lambda_3, \cdots, \lambda_c$. $SU$ can be written as
$SU = \{(v_2^T, 0, \cdots, 0)^T, \cdots, (v_c^T, 0, \cdots, 0)^T, \cdots,
(0, 0, \cdots, v_c^T)^T\}$, where each $0$ denotes an $n$-dimensional
zero vector and $SU$ has $c \times (c - 1)$ elements. We can expand $SU$
into a space $SSU$ in which each point is a linear combination of the
elements of $SU$. The multi-partition problem can then be written as:

$$\min Z = \sum_{q=1}^{n \times c} b_q^2 \tau_q \approx \max \hat{Z}
= \sum_{u_q \in SSU} b_q^2 \tau_q
\approx \bar{\lambda} \sum_{u_q \in SSU} b_q^2, \qquad (5)$$

where $b_q = S^T u_q$ and $\bar{\lambda}$ is the average value of
$\tau_{c+1}$ to $\tau_{c \times c}$ (which is also the average value of
$\lambda_2$ to $\lambda_c$). $\sum_{u_q \in SSU} b_q^2$ denotes the
squared length of the projection of the vector $S$ onto the space $SSU$.
Obviously, the longer the projection is, the closer $S$ is to the
optimum. It is difficult to obtain the optimal $S$. In this paper,
however, we focus on how to evaluate the significance of community
structure. Can we avoid this hard problem and still measure the
significance? For a network with a clear community structure, the
community structure should change only a little even if there are a few
error links. In contrast, when the community structure is fuzzy, a few
error links or a slight perturbation will lead to a big change in the
community structure. This property should be reflected in the space
$SSU$. That is, for the same change of links, if the community structure
is significant, the space $SSU$ will change a little; otherwise it will
change considerably. The space $SSU$ is spanned by simple combinations of
$v_2, v_3, \cdots, v_c$; therefore, the robustness of the space $SSU$
equals the robustness of the eigenvalues
$\lambda_2, \lambda_3, \cdots, \lambda_c$ and of the eigenvectors
$v_2, v_3, \cdots, v_c$.

Suppose that $\delta A$ is the perturbation of the links of the original
network. Then we can write $\delta L$, $\delta\lambda_i$ and
$\delta v_i$ for the corresponding perturbations of the Laplacian matrix
$L$ and of its eigenvalues and eigenvectors. According to the stability
theory of eigenvalues and eigenvectors [14], we have the following
equation:

$$(\delta L + L)(\delta v_i + v_i)
= (\delta\lambda_i + \lambda_i)(\delta v_i + v_i). \qquad (6)$$

Dropping the second-order small quantities, we have

$$\delta L\, v_i + L\, \delta v_i
= \lambda_i\, \delta v_i + \delta\lambda_i\, v_i. \qquad (7)$$

After some deductions we obtain:

$$\delta\lambda_i = \frac{v_i^T \delta L\, v_i}{v_i^T v_i}, \qquad
\delta v_i = \sum_{j} h_{ij} v_j, \qquad (8)$$

where $h_{ij} = \frac{v_j^T \delta L\, v_i}{(\lambda_i - \lambda_j)\,
v_j^T v_j}$ for $i \neq j$. Therefore, we have

$$|\delta\lambda_i| \le \|\delta L\|, \qquad (9)$$

which implies that for any network, whether or not its community
structure is significant, the change of the eigenvalues is bounded by the
perturbation strength alone. In this way, the eigenvalues are always
stable, so it is not necessary to consider the stability of the
eigenvalues.
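
The eigenvalue bound of Eq. (9) can be illustrated numerically. The sketch below assumes a random sparse graph and a perturbation consisting of one added link; the function name `eigenvalue_shift` and all parameter values are hypothetical. For symmetric matrices the bound holds exactly in the spectral norm (Weyl's inequality), while $v_i^T \delta L\, v_i$ is only the first-order estimate of Eq. (8).

```python
import numpy as np

def laplacian(adj):
    """Return L = D - A for a symmetric 0/1 adjacency matrix."""
    adj = np.asarray(adj, dtype=float)
    return np.diag(adj.sum(axis=1)) - adj

def eigenvalue_shift(adj, new_edge):
    """Compare Laplacian spectra before and after adding one edge.

    Returns (exact shifts, first-order estimates v_i^T dL v_i, ||dL||_2).
    """
    i, j = new_edge
    perturbed = np.array(adj, dtype=float)
    perturbed[i, j] = perturbed[j, i] = 1.0
    L0, L1 = laplacian(adj), laplacian(perturbed)
    dL = L1 - L0
    lam0, V = np.linalg.eigh(L0)
    lam1 = np.linalg.eigvalsh(L1)
    # Diagonal of V^T dL V, i.e. v_i^T dL v_i for every eigenvector v_i.
    first_order = np.einsum("ji,jk,ki->i", V, dL, V)
    return lam1 - lam0, first_order, np.linalg.norm(dL, 2)

rng = np.random.default_rng(0)
n = 30
upper = np.triu(rng.random((n, n)) < 0.2, k=1)
A = (upper | upper.T).astype(float)
A[0, 1] = A[1, 0] = 0.0            # make sure the perturbation really adds a link
shifts, estimates, strength = eigenvalue_shift(A, (0, 1))
```

Every entry of `shifts` stays within `strength` in absolute value, regardless of how clear or fuzzy the communities of the sampled graph are.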

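Without loss of generality, we can let $h_{i1} = h_{ii} = 0$ for
$i \neq 1$. Then the relative error of $v_i$ can be bounded as

$$\frac{\|\delta v_i\|}{\|v_i\|} \le \|\delta L\|
\sum_{j=2,\, j \neq i}^{n} \frac{1}{|\lambda_i - \lambda_j|}. \qquad (10)$$

In Eq. (10), $\|\delta L\|$ is the perturbation strength and
$\sum_{j=2,\, j \neq i}^{n} \frac{1}{|\lambda_i - \lambda_j|}$ is the
amplification coefficient, which is used to measure the stability of
$v_i$. Integrating the stability of the eigenvectors belonging to
$\lambda_2$ to $\lambda_c$, we define $R$ as the stability index of the
space $SSU$:

$$R = \sum_{j=c+1}^{n} \frac{1}{\lambda_j - \bar{\lambda}}. \qquad (11)$$

Of course, $R$ is an important index of the network, which can be used to
measure the significance of community structure.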
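
A minimal sketch of how the stability index could be computed, assuming that Eq. (11) reads $R = \sum_{j=c+1}^{n} 1/(\lambda_j - \bar{\lambda})$ with $\bar{\lambda}$ the average of $\lambda_2$ to $\lambda_c$. The function names and the toy network of disconnected complete graphs are illustrative assumptions.

```python
import numpy as np

def stability_index(adj, c):
    """R = sum_{j=c+1}^{n} 1 / (lambda_j - mean(lambda_2..lambda_c)), for c >= 2."""
    adj = np.asarray(adj, dtype=float)
    lap = np.diag(adj.sum(axis=1)) - adj
    lam = np.linalg.eigvalsh(lap)      # ascending order: lam[0] is lambda_1
    lam_bar = lam[1:c].mean()          # average of lambda_2 .. lambda_c
    return float(np.sum(1.0 / (lam[c:] - lam_bar)))

def disjoint_cliques(c, size):
    """Adjacency matrix of c disconnected complete graphs (toy example)."""
    block = np.ones((size, size)) - np.eye(size)
    return np.kron(np.eye(c), block)

# 4 cliques of 5 nodes: eigenvalue 0 four times, eigenvalue 5 sixteen times.
R = stability_index(disjoint_cliques(c=4, size=5), c=4)
```

For this toy network $\bar{\lambda} = 0$ and the remaining 16 eigenvalues all equal 5, so `R` evaluates to 16/5 up to rounding.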



B. Index of the Significance 

Although $R$ makes sense mathematically, it is not convenient for
measuring and comparing the significance of different networks. In this
section, we define an efficient index to measure the significance of
community structure. As with the definition of a temperature scale, if we
know the values of $R$ for the most significant and for the fuzziest
community structure, the robustness can be scaled into the interval
[0, 1], which is very intuitive to use.

What kind of network possesses the most significant community structure?
Suppose that the network size is $n$, the average degree is $k$ and the
community number is $c$, where $c \ll n$. Finding the most significant
community structure amounts to solving the following optimization
problem:



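$$\min R = \sum_{i=c+1}^{n} \frac{1}{\lambda_i - \bar{\lambda}} \qquad (12)$$

$$\text{s.t.} \quad \frac{1}{n} \sum_{i=1}^{n} \lambda_i = k.$$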



For the above optimization problem, we directly set $\bar{\lambda} = 0$.
By the Lagrange multiplier method, we obtain that when
$\bar{\lambda} = 0$ and $\lambda_{c+1} = \lambda_{c+2} = \cdots =
\lambda_n = \frac{nk}{n-c}$, $R$ achieves its global minimum value
$R = \frac{(n-c)^2}{nk} \approx \frac{n}{k}$. However,
$\bar{\lambda} = 0$ implies that there are no connections among the
communities and that the network is not connected, which does not fit our
basic assumption. This kind of unconnected network can be modified
slightly to meet our requirement. We can generate a network with
$c = \frac{n}{k+1}$ communities, in which each community is a completely
connected subgraph containing $k+1$ nodes. Among the $c$ communities
there are only $c - 1$ connections, which guarantee that the whole
network is connected. For this kind of network,
$\bar{\lambda} \approx 0$, $\lambda_{c+1}, \lambda_{c+2}, \cdots,
\lambda_n \approx k + 1$, and the corresponding $R$ achieves the global
minimum value $R \approx \frac{n}{k}$, as shown in Fig. 1a.

FIG. 1: a. The dependence of the maximum h on the community number in
artificial connected networks. Given a degree $k$, each community is a
completely connected subgraph with $k+1$ nodes. Among the communities
there are $c - 1$ connections making the whole network connected, where
$c$ is the number of communities. From the plot we can see that the
maximum h is very close to 1. b. The dependence of $\lambda_2$ to
$\lambda_c$ on the community size. In this plot, 2C, 3C and 5C denote
that there are 2, 3 and 5 equal communities in the network, respectively.
Each node has 20 expected links with nodes in the same community and 5
expected links with other communities. The error bars denote the standard
deviation. We can see that the standard deviation of $\lambda_2$ to
$\lambda_c$ is small and that it does not depend considerably on the
community size.
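
The Lagrange-multiplier step can be spelled out as follows. This is a sketch under the assumption that $\bar{\lambda} = 0$ forces $\lambda_1 = \cdots = \lambda_c = 0$, so that the constraint reduces to $\sum_{i=c+1}^{n} \lambda_i = nk$; the multiplier symbol $\alpha$ is introduced here only for illustration.

```latex
\begin{aligned}
\mathcal{F} &= \sum_{i=c+1}^{n} \frac{1}{\lambda_i}
   + \alpha \left( \sum_{i=c+1}^{n} \lambda_i - nk \right), \\
\frac{\partial \mathcal{F}}{\partial \lambda_i}
  &= -\frac{1}{\lambda_i^{2}} + \alpha = 0
  \;\Longrightarrow\; \lambda_i = \alpha^{-1/2} \quad (i = c+1, \dots, n), \\
(n-c)\,\lambda_i &= nk
  \;\Longrightarrow\; \lambda_i = \frac{nk}{n-c}, \\
R_{\min} &= (n-c) \cdot \frac{n-c}{nk} = \frac{(n-c)^{2}}{nk}
  \approx \frac{n}{k} \quad (c \ll n).
\end{aligned}
```

Because $1/\lambda$ is convex for $\lambda > 0$, the stationary point with all $\lambda_i$ equal is the minimum of the sum under the linear constraint.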

The spectral properties of complex network matrices have been well
studied [12, 13]. These studies throw light on the universal properties
of the eigenvalue distribution of random sparse matrices. We investigate
the distribution of eigenvalues for different community structures in
networks with both homogeneous (Poisson) and heterogeneous (scale-free)
degree distributions. The results show that the distribution of
eigenvalues is mainly determined by the average degree and the degree
distribution, and is not considerably related to the community structure
(as shown in Fig. 2). Moreover, for networks with different sizes and
community structures, we investigate the most relevant eigenvalues
$\lambda_2, \lambda_3, \cdots, \lambda_c$, and we find that they depend
only on the community structure and are not related to the size of either
the communities or the network (as shown in Fig. 1b).

According to [12], [13], Eq. (11) and Fig. [?], we have $R \propto n$
strictly, as shown in Fig. 3. From Fig. 3 we can see that, for both
homogeneous and heterogeneous degree distributions, the great mass of the
eigenvalues $\lambda$ lies near the average degree, although the
distribution of eigenvalues cannot be scaled by the average degree. We
have conducted many numerical experiments on both homogeneous and
heterogeneous networks and find that $1/R \propto k$ holds well.
Therefore, given a network with robustness $R = h \frac{n}{k}$, when the
community structure is more significant, $h$ will be small. From the
above analysis of the clearest community structure in a large enough
network, we have a lower bound on $h$, which almost approaches 1. It is
very hard to get the $h$ of the fuzziest community structure, because the
continuity properties of matrix spectra are very complicated. So, to
simplify the index, we define $H = \frac{1}{h} = \frac{n}{kR}$ as the
significance of community structure; $H$ lies almost in [0, 1] when the
network size is large enough.

FIG. 2: The distribution of the eigenvalues of the Laplacian matrix.
Community structure has no considerable impact on the eigenvalue
distribution for both homogeneous and heterogeneous networks. The
eigenvalue distribution can be scaled by the average degree for both
homogeneous and heterogeneous networks. The relative width of the peak
decreases with increasing average degree, while other qualitative
features are the same. a. In homogeneous networks, each community is an
ER network. The legend c = 1 means that the network is an ER network
without communities. When c = 2, the two community sizes are 100 and 900.
When c = 3, there are 3 communities of sizes 100, 200 and 700. When
c = 5, all community sizes are 200. We also tested many other community
size distributions and the results are the same (not shown here). b. We
employ the LFR benchmark and the BA model to generate the heterogeneous
networks. In the LFR benchmark, we set the maximum degree to 50, and the
maximum and minimum community sizes to 50 and 20, respectively.
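
As a rough illustration of the index, the sketch below evaluates $H = n/(kR)$ on a near-ideal network of the kind described above: complete subgraphs joined in a chain by single links. It assumes the reading $R = \sum_{j=c+1}^{n} 1/(\lambda_j - \bar{\lambda})$ of Eq. (11); the function names and the sizes chosen are hypothetical.

```python
import numpy as np

def significance_index(adj, c):
    """H = n / (k * R), with R the stability index for c communities (c >= 2)."""
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    k = adj.sum() / n                               # average degree
    lam = np.linalg.eigvalsh(np.diag(adj.sum(axis=1)) - adj)
    lam_bar = lam[1:c].mean()                       # average of lambda_2 .. lambda_c
    R = np.sum(1.0 / (lam[c:] - lam_bar))
    return float(n / (k * R))

def clique_chain(c, size):
    """c complete graphs of `size` nodes joined in a chain by c-1 single links."""
    adj = np.kron(np.eye(c), np.ones((size, size)) - np.eye(size))
    for b in range(c - 1):
        u, v = b * size + size - 1, (b + 1) * size  # last node of block b, first of b+1
        adj[u, v] = adj[v, u] = 1.0
    return adj

# 10 communities of 21 nodes each (degree about 20 inside every community).
H_clear = significance_index(clique_chain(c=10, size=21), c=10)
```

For this finite example `H_clear` comes out slightly above 1, in line with the statement that $H$ lies only approximately in [0, 1] and approaches the upper end for the clearest community structure.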



II. RESULT 

A. Artificial Networks 

Let us test the validity of our index. First, we use the classical GN
benchmark presented by Girvan and Newman [4]. Each network has $n = 128$
nodes that are divided into 4 communities with 32 nodes each. Edges
between two nodes are introduced with different probabilities, which
depend on whether the two nodes belong to the same community or not. Each
node has $\langle k_{in} \rangle$ links on average with nodes in its own
community and $\langle k_{out} \rangle$ links with the other communities,
and we keep $\langle k_{in} \rangle + \langle k_{out} \rangle = 16$. As
is well known, the communities become fuzzier and thus more difficult to
identify when $\langle k_{out} \rangle$ increases. Hence, the
significance of the community structure will also tend to be weaker and
the $H$ index will decrease. The results of the numerical experiments are
shown in Fig. 4. We find that the index $H$ works well on the GN
benchmark. When the community structure is very clear, $H$ is very close
to 1; when the network is nearly a random one, the corresponding $H$ is
close to 0.3. Thus, we argue that for a given network, when the
corresponding $H$ is larger than 0.3, there exists community structure.
Moreover, the larger the $H$ index is, the more significant the community
structure will be.

FIG. 3: The relationship $R \propto \frac{n}{k}$. From the four plots we
can see that $R \propto \frac{n}{k}$ holds almost exactly. a. The
dependence of $R$ on the network size $n$. All communities have the same
size, and each node has 20 links to others in the same community and 5
links to outside nodes; $c$ is the community number. b. The dependence of
$R$ on the network size $n$ in the LFR benchmark. The average degree is
$k = 20$ and $P(k) \propto k^{-2.5}$; the maximum degree is 50, and the
minimum and maximum community sizes are 20 and 50, respectively. c. The
dependence of $1/R$ on the average degree $k$. For different network
sizes $n$, all communities have the same size. d. The dependence of $1/R$
on the average degree $k$ in the LFR benchmark. The maximum degree is
200, and the maximum and minimum community sizes are 200 and 100,
respectively. Average degree is 20 and $P(k) \propto k^{-2.5}$.
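
The GN experiment can be sketched as follows. This is an illustrative reimplementation, not the code used for the figures: the generator draws each pair independently with probabilities chosen so that the expected in- and out-degrees are $\langle k_{in} \rangle$ and $\langle k_{out} \rangle$, and $H$ is computed as $n/(kR)$ under the reading $R = \sum_{j=c+1}^{n} 1/(\lambda_j - \bar{\lambda})$. The function names, the seed and the number of samples are assumptions.

```python
import numpy as np

def gn_benchmark(k_out, rng, n=128, groups=4, k_total=16):
    """Sample a GN-style planted-partition graph as a 0/1 adjacency matrix."""
    size = n // groups
    labels = np.repeat(np.arange(groups), size)
    same = labels[:, None] == labels[None, :]
    p_in = (k_total - k_out) / (size - 1)    # expected k_in links inside the group
    p_out = k_out / (n - size)               # expected k_out links to other groups
    prob = np.where(same, p_in, p_out)
    upper = np.triu(rng.random((n, n)) < prob, k=1)
    return (upper | upper.T).astype(float)

def significance_index(adj, c):
    """H = n / (k * R) for c communities (c >= 2)."""
    n = adj.shape[0]
    k = adj.sum() / n
    lam = np.linalg.eigvalsh(np.diag(adj.sum(axis=1)) - adj)
    R = np.sum(1.0 / (lam[c:] - lam[1:c].mean()))
    return float(n / (k * R))

rng = np.random.default_rng(42)
# Average H over 10 sampled networks for a clear, an intermediate and a fuzzy case.
H_by_kout = {
    k_out: np.mean([significance_index(gn_benchmark(k_out, rng), 4)
                    for _ in range(10)])
    for k_out in (1, 4, 8)
}
```

Printing `H_by_kout` shows how the index falls as the communities are mixed more strongly; the exact values depend on the random samples.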

We also test the index on the more challenging LFR benchmark presented by
Lancichinetti, Fortunato and Radicchi [15]. In the LFR benchmark, each
node is given a degree taken from a power-law distribution with exponent
$\gamma$, and the sizes of the communities are taken from a power-law
distribution with exponent $\beta$. Moreover, each node shares a fraction
$1 - \mu$ of its links with other nodes of its community and a fraction
$\mu$ with the other nodes in the network; $\mu$ is the mixing parameter.
The significance of the community structure can be adjusted through the
mixing parameter $\mu$. The numerical results for the LFR benchmark are
shown in Fig. 4. We can see that $H$ decreases as $\mu$ increases and
that $H$ is independent of the community size distribution. Moreover,
when the power-law exponent of the degree distribution becomes larger,
the community structure becomes more significant. That is, the more
homogeneous the degree distribution is, the more significant the
community structure will be, other conditions being equal.

FIG. 4: The performance of the $H$ index on both the GN benchmark and the
LFR benchmark. In the GN benchmark, $H$ decreases with increasing
$\langle k_{out} \rangle$. When the community structure is very clear,
$H$ is very close to 1; when the network is close to one without
community structure, $H$ is close to 0.3, which implies that for a given
network with $H$ less than 0.3 it is not safe to say that significant
community structure exists. In the LFR benchmark, the average degree is
$k = 20$, the maximum degree is 50 and $P(k) \propto k^{-\gamma}$. The
maximum and minimum community sizes are 50 and 20, respectively;
moreover, $Q(m) \propto m^{-\beta}$, where $m$ denotes the community
size. We can see that the $H$ index decreases as the mixing parameter
$\mu$ increases. When $\mu > 0.5$ (no significant community structure),
$H$ is near 0.3, which is similar to the GN benchmark.



[FIG. 5 panels: GN benchmark with $k_{out} = 1$, $c = 4$ and with $k_{out} = 6$, $c = 4$; Zachary karate, $c = 2$; College football, $c = 12$; Political books, $c = 3$. Horizontal axis: Number of Community.]



B. Real-world Networks



FIG. 5: The optimal community number. The range of the optimal $c$ is
about [...] for all networks. From plots a and b we find that on the GN
benchmark, when $c = 4$ (the predetermined community number), $R$
achieves its lowest value and the corresponding
$\lambda_{c+1} - \lambda_c$ achieves its largest value. c and d: the
empirical results for the Zachary karate club network, the College
football network and the Political books network. It was found that the
Zachary karate club has 2 communities and that the College football
network has 12 communities. From the plots we can see that $R$ achieves
its lowest value when the community number corresponds to reality. The
corresponding values of $\lambda_{i+1} - \lambda_i$ also behave
reasonably. For the Political books network, the true number of
communities is not known; the method indicates that it has 3 communities.



So far, we have not discussed how to obtain the optimal community number
$c$. For many real-world networks, we do not know the community number
before calculating the index value or the partition. Many numerical
experiments (as shown in Fig. 5) support that the community structure is
clearest when the community number is the optimal $c$. So, generally
speaking, the community number with the lowest $R$ is the optimal $c$.
Moreover, at the optimal $c$, the value of $\lambda_{c+1} - \lambda_c$ is
comparatively very large. So we can also resort to the differences
between $\lambda_i$ and $\lambda_{i+1}$ to detect the optimal $c$.
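
Both selection rules can be sketched in a few lines. The example below assumes the reading $R(c) = \sum_{j=c+1}^{n} 1/(\lambda_j - \bar{\lambda})$ of Eq. (11) and scans candidate values of $c$ on a toy network of four complete subgraphs joined in a chain; the function names, the candidate range and the toy network are illustrative assumptions.

```python
import numpy as np

def laplacian_spectrum(adj):
    """Ascending eigenvalues of L = D - A."""
    adj = np.asarray(adj, dtype=float)
    return np.linalg.eigvalsh(np.diag(adj.sum(axis=1)) - adj)

def optimal_c(adj, c_max):
    """Return (argmin_c R(c), argmax_c lambda_{c+1} - lambda_c) for 2 <= c <= c_max."""
    lam = laplacian_spectrum(adj)
    candidates = range(2, c_max + 1)
    # lam is 0-indexed: lam[c] is lambda_{c+1}, lam[1:c] holds lambda_2..lambda_c.
    R = {c: np.sum(1.0 / (lam[c:] - lam[1:c].mean())) for c in candidates}
    gap = {c: lam[c] - lam[c - 1] for c in candidates}
    return min(R, key=R.get), max(gap, key=gap.get)

# Toy network: 4 cliques of 8 nodes chained by 3 single links.
size, blocks = 8, 4
A = np.kron(np.eye(blocks), np.ones((size, size)) - np.eye(size))
for b in range(blocks - 1):
    u, v = b * size + size - 1, (b + 1) * size
    A[u, v] = A[v, u] = 1.0

c_by_R, c_by_gap = optimal_c(A, c_max=10)
```

On this toy network both rules select the planted number of communities, 4. On real networks the two rules need not agree, and the candidate range `c_max` has to be chosen by the user.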

We apply the index to many real networks (see Table I and the detailed
information in the supplementary material). The data are taken from the
references and web sites [16-24]. Real networks are usually classified
into three categories: social networks (such as scientist collaborations
and friendships), biological networks (such as protein-interaction
networks and metabolic networks) and technological networks (such as the
Internet and the WWW). First, we analyze several social networks,
including the Zachary karate club network [16], the dolphin network [17],
the college football network [?], the Jazz network [20], the scientists
collaboration network [22] and so on. The results are very similar to our
previous work [11]. We find that the Jazz community structure is the most
significant one, while the Santa Fe scientists collaboration network and
the Political blogs network are comparatively insignificant. Generally
speaking, community structure is most notable in social networks.
Moreover, we analyze some biological networks, such as
protein-interaction networks (E. coli [23], Yeast [24] and H. Sapiens
[?]), many metabolic networks [24] and the C. elegans neural network. We
find that for the protein-interaction networks, H is 0.14 for E. coli,
0.21 for H. Sapiens and 0.40 for Yeast, the last of which is high and
differs from the previous results. In the metabolic networks, the H
indexes of Aquifex aeolicus, Helicobacter pylori and Yersinia pestis are
all 0.36, which is consistent with previous works. For the C. elegans
metabolic network, however, the significance is 0.62, which is very high
and differs from previous work because it is not easy to obtain the
proper community number c (see supplementary material). The significance
of the C. elegans neural network is 0.57, which corresponds well to
previous work.

TABLE I: The R and H indexes of some real networks. The R index is the
robustness of community structure, which can be obtained by perturbation
(see Ref. [11]). The table shows the names of the real networks and the
corresponding index values.



network                 size (nodes, edges)   S        R      H      type
E.coli                  1442, 5873            61.30    0.11   0.14   protein
Yeast                   1458, 1971            112.95   0.12   0.40   protein
H. Sapiens              693, 982              38.48    0.18   0.21   protein
C.elegans metabolic     453, 2032             19.25    0.17   0.62   metabolic
Aquifex aeolicus        1473, 3354            68.39    0.17   0.36   metabolic
Helicobacter pylori     1341, 3087            62.76    0.17   0.36   metabolic
Yersinia pestis         1922, 4389            108.84   0.15   0.36   metabolic
43 metabolic networks   1472, 3395            71.25    0.17   0.36   metabolic
C.elegans neural        297, 2148             5.52     0.22   0.52   neural
Santa Fe scientists     118, 200              2.45     0.27   0.72   social
Zachary karate          34, 78                0.32     0.25   0.46   social
Dolphin                 62, 159               2.07     0.24   0.42   social
College football        115, 613              1.67     0.34   0.79   social
Political books                               1.63     0.31   0.32





III. CONCLUSION AND DISCUSSION 

In this paper, we propose an index to evaluate the significance of
community structure without knowing the community structure itself. We
transform the problem of the significance of community structure into the
problem of the stability of the eigenvalues and eigenvectors of the
Laplacian matrix. The index has a sound mathematical basis, which makes
it reliable. With the index, the optimal community number can also be
obtained before partitioning, which is nearly impossible for many
partition algorithms. Moreover, we apply the index to many real-world
networks, such as social networks, a neural network, protein-interaction
networks and metabolic networks. We find that the significance of
community structure is usually high in social networks and very high in
the C. elegans metabolic and neural networks, while it is comparatively
low in protein-interaction networks and some other metabolic networks.



Acknowledgement 



Yanqing Hu wishes to thank Prof. Shlomo Havlin for 
very useful discussions, Dr. Erbo Zhao for his help in 
compiling LFR-benchmark and Dan Bu for some help in 
English writing. This work is partially supported by 985 
Project and NSFC under the grant No. 70771011, and 
No. 60534080. 



[1] Fortunato, S. (2009) arXiv:0906.0612.

[2] Wu, F. and Huberman, B. A. (2004) Finding communities in linear time:
a physics approach. Eur. Phys. J. B 38: 331-338.

[3] Newman, M. E. J. (2006) Finding community structure in networks using
the eigenvectors of matrices. Phys. Rev. E 74: 036104.

[4] Girvan, M. and Newman, M. E. J. (2002) Community structure in social
and biological networks. Proc. Natl. Acad. Sci. USA 99: 7821-7826.

[5] Donetti, L. and Munoz, M. A. (2004) Detecting network communities: a
new systematic and efficient algorithm. J. Stat. Mech. P10012.

[6] Wang, X., Li, X. and Cheng, G. (2006) Complex network theory and
application. Tsinghua University Press, pp. 162-193.

[7] Newman, M. E. J. (2006) Modularity and community structure in
networks. Proc. Natl. Acad. Sci. USA 103: 8577-8582.

[8] Fan, Y., Li, M., Zhang, P., Wu, J. and Di, Z. (2007) Accuracy and
precision of methods for community identification in weighted networks.
Physica A 377: 363-372.

[9] Bianconi, G., Pin, P. and Marsili, M. (2009) Assessing the relevance
of node features for network structure. Proc. Natl. Acad. Sci. USA 106:
11433-11438.

[10] Gfeller, D., Chappelier, J.-C. and de Los Rios, P. (2005) Finding
instabilities in the community structure of complex networks. Phys. Rev.
E 72: 056135.

[11] Hu, Y., Nie, Y., Yang, Y., Cheng, J., Fan, Y. and Di, Z. (2009)
Measuring Significance of Community Structure in Complex Networks.
arXiv:0906.0493.

[12] McGraw, P. N. and Menzinger, M. (2008) Laplacian spectra as a
diagnostic tool for network structure and dynamics. Phys. Rev. E 77:
031102.

[13] Dorogovtsev, S. N., Goltsev, A. V., Mendes, J. F. and Samukhin, A.
N. (2003) Spectra of complex networks. Phys. Rev. E 68: 046109.

[14] Faddeev, D. K. and Faddeeva, V. N. (1965) Calculation method of
linear algebra. Shanghai Science and Technology Press (translated into
Chinese by Li, G. et al.).

[15] Lancichinetti, A., Fortunato, S. and Radicchi, F. (2008) Benchmark
graphs for testing community detection algorithms. Phys. Rev. E 78:
046110.

[16] Zachary, W. W. (1977) Journal of Anthropological Research 33:
452-473.

[17] Lusseau, D., Schneider, K., Boisseau, O. J., Haase, P., Slooten, E.
and Dawson, S. M. (2003) The bottlenose dolphin community of Doubtful
Sound features a large proportion of long-lasting associations.
Behavioral Ecology and Sociobiology 54: 396-405.

[18] Adamic, L. A. and Glance, N. (2005) The political blogosphere and
the 2004 U.S. election: divided they blog. In Proceedings of the WWW-2005
Workshop on the Weblogging Ecosystem.

[19] Watts, D. J. and Strogatz, S. H. (1998) Collective dynamics of
'small-world' networks. Nature 393: 440-442.

[20] Gleiser, P. and Danon, L. (2003) Community structure in jazz. Adv.
Complex Syst. 6: 565.

[21] Duch, J. and Arenas, A. (2005) Phys. Rev. E 72: 027104.

[22] http://www-personal.umich.edu/~mejn/netdata/

[23] Database of Interacting Proteins (DIP). http://dip.doe-mbi.ucla.edu

[24] http://www.nd.edu/~networks



IV. SUPPLEMENTARY




[Supplementary figure panels: Aquifex aeolicus, C.elegans Metabolic, Helicobacter pylori, Yersinia pestis, Yeast, E.coli, H. Sapiens, Email, Political Blogs, SFI. Horizontal axis: Number of Community.]



